Software defect number prediction method based on data oversampling and ensemble learning
JIAN Yiheng, YU Xiao
Journal of Computer Applications    2018, 38 (9): 2637-2643.   DOI: 10.11772/j.issn.1001-9081.2018020507
Abstract
Predicting the number of defects in software modules helps testers focus on the modules with more defects and so allocate limited testing resources reasonably. To address the imbalance of software defect datasets, this paper proposes a method based on oversampling and ensemble learning (abbreviated as SMOTENDEL) for predicting the number of defects. First, n balanced datasets are obtained by oversampling the original software defect dataset n times. Then, n individual models for predicting the number of defects are trained on the n balanced datasets using regression algorithms. Finally, the n individual models are combined into an ensemble prediction model, which is used to predict the number of defects in a new software module. The experimental results show that SMOTENDEL performs better than the original prediction method. With Decision Tree Regression (DTR), Bayesian Ridge Regression (BRR) and Linear Regression (LR) as the individual prediction model, the improvement is 7.68%, 3.31% and 3.38%, respectively.
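The three steps in the abstract (oversample n times, train n regressors, combine them) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the oversampling details or the combination rule, so the interpolation between two randomly chosen defective modules, the averaging of predictions, the clipping at zero, the single lines-of-code feature, the toy data, and all function names (`oversample`, `fit_line`, `train_ensemble`, `predict`) are assumptions made for the example. A one-feature least-squares line stands in for the DTR, BRR and LR learners named in the abstract.

```python
import random
from statistics import fmean


def oversample(data, rng):
    """Return a balanced copy of data, a list of (feature, defect_count) pairs.

    Assumption: synthetic defective modules are made by interpolating between
    two randomly chosen defective modules until defective and defect-free
    modules are equal in number.
    """
    defective = [d for d in data if d[1] > 0]
    clean = [d for d in data if d[1] == 0]
    if len(defective) < 2:
        return list(data)
    synthetic = []
    for _ in range(max(len(clean) - len(defective), 0)):
        (x1, y1), (x2, y2) = rng.sample(defective, 2)
        t = rng.random()
        synthetic.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return list(data) + synthetic


def fit_line(data):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in data) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def train_ensemble(data, n, seed=0):
    """Train n individual models, each on its own oversampled dataset."""
    rng = random.Random(seed)
    return [fit_line(oversample(data, rng)) for _ in range(n)]


def predict(models, x):
    """Combine the individual models by averaging (assumed rule), clipped at 0."""
    return max(0.0, fmean(m(x) for m in models))


# Hypothetical toy data: (lines of code, number of defects) per module.
modules = [(50, 0), (80, 0), (120, 0), (60, 0), (90, 0), (200, 0),
           (400, 2), (650, 5), (300, 1)]
models = train_ensemble(modules, n=5)
print(predict(models, 700))
```

Because each of the n oversampling runs draws different synthetic modules, the individual models differ slightly, and averaging them reduces the variance that any single oversampled dataset would introduce.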